Enhanced language modelling with phonologically constrained morphological analysis

Authors

  • Alex Chengyu Fang
  • Mark Huckvale
Abstract

Phonologically constrained morphological analysis (PCMA) is the decomposition of words into their component morphemes, conditioned by both orthography and pronunciation. This article describes PCMA and its application in large-vocabulary continuous speech recognition to enhance recognition performance in some tasks. Our experiments, which use the British National Corpus and the LOB Corpus as training data and WSJCAM0 as test data, show clearly that PCMA leads to a smaller lexicon, smaller language models, superior word lattices and a decrease in word error rates. PCMA seems to show most benefit in open-vocabulary tasks, where the productivity of a morph unit lexicon substantially reduces out-of-vocabulary rates.
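
The point about open-vocabulary tasks can be illustrated with a toy sketch. It is not the authors' PCMA: the segmentation below uses spelling only, whereas PCMA also conditions on pronunciation. The function names (`segment`, `oov_rate`), the tiny word list and the morph inventory are all hypothetical and chosen only to show why a morph unit lexicon can cover unseen words.

```python
def segment(word, morphs):
    """Return one split of `word` into units from `morphs`, or None.

    Orthography-only toy segmentation (an assumption for illustration;
    PCMA as described in the abstract also uses pronunciation).
    """
    # best[i] holds a segmentation of word[:i], or None if none exists.
    best = [None] * (len(word) + 1)
    best[0] = []
    for end in range(1, len(word) + 1):
        for start in range(end):
            piece = word[start:end]
            if best[start] is not None and piece in morphs:
                best[end] = best[start] + [piece]
                break
    return best[len(word)]


def oov_rate(tokens, is_covered):
    """Fraction of tokens that the lexicon cannot account for."""
    missing = sum(1 for token in tokens if not is_covered(token))
    return missing / len(tokens)


# Hypothetical toy data, not taken from the paper's corpora.
train_words = {"walk", "walked", "talking", "jump"}
morphs = {"walk", "talk", "jump", "ed", "ing", "s"}
test_tokens = ["walked", "talked", "jumping", "walks", "run"]

# Whole-word lexicon: a token is covered only if it was seen in training.
word_oov = oov_rate(test_tokens, lambda t: t in train_words)

# Morph unit lexicon: a token is covered if it can be built from known units.
morph_oov = oov_rate(test_tokens, lambda t: segment(t, morphs) is not None)

print(f"word lexicon OOV rate:  {word_oov:.2f}")
print(f"morph lexicon OOV rate: {morph_oov:.2f}")
```

In this toy setting the morph inventory covers "talked", "jumping" and "walks" even though none of them appears in the training word list, while "run" stays out of vocabulary under both lexicons.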

Similar resources

An Integrated System For Morphological Analysis Of The Slovene Language

The paper presents an integrated environment for morphological analysis of word-forms of the Slovene language. The system consists of a lexicon input and maintenance module, a lexicon output module for accessing lexical word forms, a two-level rule compiler and a two-level morphological analysis/synthesis unit. The basic paradigms and lexical alternations of word forms are handled by the lexico...

Cognate status and cross-script translation priming.

Greek-French bilinguals were tested in three masked priming experiments with Greek primes and French targets. Related primes were the translation equivalents of target words, morphologically related to targets, or phonologically related to targets. In Experiment 1, cognate translation equivalents (phonologically similar translations) showed facilitatory priming, relative to matched phonological...

Pars Morph: A Persian Morphological Analyzer

This paper introduces the theoretical foundation, implementation and uses of Pars Morph, a Persian morphological analyzer. Pars Morph is a rule-based Persian morphological analysis system that analyzes the internal structure of words in Persian and determines the grammatical category and function of the word parts. Pars Morph being in link with a lexicon covering about 45...

A Comparative Study of English and Persian Advertising Slogans: Linguistic Means through the Sands of Time

This study was a contrastive analysis of the evolution of English and Persian advertising slogans to investigate their similarities/differences in using rhetorical figures, and the evolution in the use of these figures in the slogans of each language. Thus, 800 Persian and English slogans from the last four decades were collected. Lapsanka's framework (2006) including different aspects with som...

A Morphological Parser For Afrikaans

Publication date: 2000